<!DOCTYPE html>
<html>
<head>
    

    

    



    <meta charset="utf-8">
    
    
    <link rel="canonical" href="http://xiejm.com/Hive/Hive--UDF.html">
    
    
    <title>Hive--UDF开发 | XieJM&#39;s Blog | 建立博客是为了记录工作经验以及生活点滴,也是将知识和经验分享给需要的朋友，希望对你有帮助！</title>
    <meta name="viewport" content="width=device-width, initial-scale=1, maximum-scale=1">
    
    <meta name="theme-color" content="#00bcd4">
    
    
    <meta name="keywords" content="Hive">
    <meta name="description" content="前言Hive中，除了提供丰富的内置函数之外还可以通过实现用户定义函数（User-Defined Functions，UDF）进行扩展（事实上，大多数Hive功能都是通过扩展UDF实现的）。 开发自定义UDF函数有两种方式，一个是继承org.apache.hadoop.hive.ql.exec.UDF，另一个是继承org.apache.hadoop.hive.ql.udf.generic.Gener">
<meta name="keywords" content="Hive">
<meta property="og:type" content="article">
<meta property="og:title" content="Hive--UDF开发">
<meta property="og:url" content="http://xiejm.com/Hive/Hive--UDF.html">
<meta property="og:site_name" content="XieJM&#39;s Blog">
<meta property="og:description" content="前言Hive中，除了提供丰富的内置函数之外还可以通过实现用户定义函数（User-Defined Functions，UDF）进行扩展（事实上，大多数Hive功能都是通过扩展UDF实现的）。 开发自定义UDF函数有两种方式，一个是继承org.apache.hadoop.hive.ql.exec.UDF，另一个是继承org.apache.hadoop.hive.ql.udf.generic.Gener">
<meta property="og:locale" content="zh-CN">
<meta property="og:updated_time" content="2017-10-08T06:39:23.586Z">
<meta name="twitter:card" content="summary">
<meta name="twitter:title" content="Hive--UDF开发">
<meta name="twitter:description" content="前言Hive中，除了提供丰富的内置函数之外还可以通过实现用户定义函数（User-Defined Functions，UDF）进行扩展（事实上，大多数Hive功能都是通过扩展UDF实现的）。 开发自定义UDF函数有两种方式，一个是继承org.apache.hadoop.hive.ql.exec.UDF，另一个是继承org.apache.hadoop.hive.ql.udf.generic.Gener">
    
        <link rel="alternate" type="application/atom+xml" title="XieJM&#39;s Blog" href="/atom.xml">
    
    <link rel="shortcut icon" href="/favicon.ico">
    <link rel="stylesheet" href="/css/style.css?v=1.6.13">
    <script>window.lazyScripts=[]</script>

    <!-- custom head -->
    

</head>

<body>
    <div id="loading" class="active"></div>

    <aside id="menu" class="hide" >
  <div class="inner flex-row-vertical">
    <a href="javascript:;" class="header-icon waves-effect waves-circle waves-light" id="menu-off">
        <i class="icon icon-lg icon-close"></i>
    </a>
    <div class="brand-wrap" style="background-image:url(/img/brand.jpg)">
      <div class="brand">
        <a href="/" class="avatar waves-effect waves-circle waves-light">
          <img src="/img/avatar.jpg">
        </a>
        <hgroup class="introduce">
          <h5 class="nickname">XieJM</h5>
          <a href="mailto:309469843@qq.com" title="309469843@qq.com" class="mail">309469843@qq.com</a>
        </hgroup>
      </div>
    </div>
    <div class="scroll-wrap flex-col">
      <ul class="nav">
        
            <li class="waves-block waves-effect">
              <a href="/index.html"  >
                <i class="icon icon-lg icon-home"></i>
                主页
              </a>
            </li>
        
            <li class="waves-block waves-effect">
              <a href="/archives/index.html"  >
                <i class="icon icon-lg icon-archives"></i>
                归档
              </a>
            </li>
        
            <li class="waves-block waves-effect">
              <a href="/tags/index.html"  >
                <i class="icon icon-lg icon-tags"></i>
                标签
              </a>
            </li>
        
            <li class="waves-block waves-effect">
              <a href="/categories/index.html"  >
                <i class="icon icon-lg icon-th-list"></i>
                分类
              </a>
            </li>
        
            <li class="waves-block waves-effect">
              <a href="https://github.com/xjmhz" target="_blank" >
                <i class="icon icon-lg icon-github"></i>
                Github
              </a>
            </li>
        
      </ul>      
    </div>
    <footer class="footer">
    <p>欢迎加入我们的大数据交流群：<br>群1：258669058 群2：126181630</p>   
    <p>        
        <span><a rel="license" href="https://creativecommons.org/licenses/by-nc-sa/4.0/deed.zh"><img src="/img/cc.png"></a></span>
        
        <span><a href="/atom.xml" target="_blank" class="rss" title="rss"><i class="icon icon-2x icon-rss-square"></i></a></span>
        
    </p>
    <p><span>XieJM &copy; 2017</span>
    </p>
    <p><span>
            
            Power by <a href="http://hexo.io/" target="_blank">Hexo</a> Theme <a href="https://github.com/yscoder/hexo-theme-indigo" target="_blank">indigo</a>
        </span>
    </p>
</footer>
  </div>
</aside>

    <main id="main">
        <header class="top-header" id="header">
    <div class="flex-row">
        <a href="javascript:;" class="header-icon waves-effect waves-circle waves-light on" id="menu-toggle">
          <i class="icon icon-lg icon-navicon"></i>
        </a>
        <div class="flex-col header-title ellipsis">Hive--UDF开发</div>
        
        <div class="search-wrap" id="search-wrap">
            <a href="javascript:;" class="header-icon waves-effect waves-circle waves-light" id="back">
                <i class="icon icon-lg icon-chevron-left"></i>
            </a>
            <input type="text" id="key" class="search-input" autocomplete="off" placeholder="输入感兴趣的关键字">
            <a href="javascript:;" class="header-icon waves-effect waves-circle waves-light" id="search">
                <i class="icon icon-lg icon-search"></i>
            </a>
        </div>
        
        
        <a href="javascript:;" class="header-icon waves-effect waves-circle waves-light" id="menuShare">
            <i class="icon icon-lg icon-share-alt"></i>
        </a>
        
    </div>
</header>
<header class="content-header post-header">

    <div class="container fade-scale">
        <h1 class="title">Hive--UDF开发</h1>
        <h5 class="subtitle">
            
                <time datetime="2017-10-02T15:47:00.476Z" itemprop="datePublished" class="page-time">
  2017-10-02
</time>


	<ul class="article-category-list"><li class="article-category-list-item"><a class="article-category-list-link" href="/categories/tech/">技术</a></li></ul>

            
        </h5>
    </div>

    


</header>


<div class="container body-wrap">
    
    <aside class="post-widget">
        <nav class="post-toc-wrap" id="post-toc">
            <h4>TOC</h4>
            <ol class="post-toc"><li class="post-toc-item post-toc-level-1"><a class="post-toc-link" href="#前言"><span class="post-toc-text">前言</span></a><ol class="post-toc-child"><li class="post-toc-item post-toc-level-2"><a class="post-toc-link" href="#环境"><span class="post-toc-text">环境</span></a></li><li class="post-toc-item post-toc-level-2"><a class="post-toc-link" href="#编写UDF类"><span class="post-toc-text">编写UDF类</span></a></li><li class="post-toc-item post-toc-level-2"><a class="post-toc-link" href="#进阶：将自定义UDF编译到hive源码中"><span class="post-toc-text">进阶：将自定义UDF编译到hive源码中</span></a></li></ol></li></ol>
        </nav>
    </aside>
    
<article id="post-Hive/Hive--UDF"
  class="post-article article-type-post fade" itemprop="blogPost">

    <div class="post-card">
        <h1 class="post-card-title">Hive--UDF开发</h1>
        <div class="post-meta">
            <time class="post-time" title="2017-10-02 23:47:00" datetime="2017-10-02T15:47:00.476Z"  itemprop="datePublished">2017-10-02</time>

            
	<ul class="article-category-list"><li class="article-category-list-item"><a class="article-category-list-link" href="/categories/tech/">技术</a></li></ul>



            
<span id="busuanzi_container_page_pv" title="文章总阅读量" style='display:none'>
    <i class="icon icon-eye icon-pr"></i><span id="busuanzi_value_page_pv"></span>
</span>


        </div>
        <div class="post-content" id="post-content" itemprop="postContent">
            <h1 id="前言"><a href="#前言" class="headerlink" title="前言"></a>前言</h1><p>Hive中，除了提供丰富的内置函数之外还可以通过实现用户定义函数（User-Defined Functions，UDF）进行扩展（事实上，大多数Hive功能都是通过扩展UDF实现的）。</p>
<p>开发自定义UDF函数有两种方式，一个是继承<code>org.apache.hadoop.hive.ql.exec.UDF</code>，另一个是继承<code>org.apache.hadoop.hive.ql.udf.generic.GenericUDF</code>；并重载<code>evaluate</code>方法。Hive API提供@Description声明，使用声明可以在代码中添加UDF的具体信息。在Hive中可以使用<code>DESC</code>、<code>DESCRIBE</code>来展现这些信息。</p>
<p>如果是针对简单的数据类型（比如String、Integer等）可以使用UDF，如果是针对复杂的数据类型（比如Array、Map、Struct等），可以使用GenericUDF，另外，GenericUDF还可以在函数开始之前和结束之后做一些初始化和关闭的处理操作。</p>
<p>Hive的源码本身就是编写UDF最好的参考资料。在Hive源代码中很容易就能找到与需求功能相似的UDF实现，只需要复制过来，并加以适当的修改就可以满足需求。</p>
<h2 id="环境"><a href="#环境" class="headerlink" title="环境"></a>环境</h2><ul>
<li>CentOS6.X</li>
<li>JDK1.8</li>
<li>Hadoop2.6.0-cdh5.7.0</li>
<li>Hive-1.1.0-cdh5.7.0</li>
<li>Maven3.3.9</li>
</ul>
<h2 id="编写UDF类"><a href="#编写UDF类" class="headerlink" title="编写UDF类"></a>编写UDF类</h2><p>1.首先在IDEA中新建一个maven项目<br>2.pom.xml中添加相关依赖</p>
<figure class="highlight xml"><table><tr><td class="gutter"><pre><div class="line">1</div><div class="line">2</div><div class="line">3</div><div class="line">4</div><div class="line">5</div><div class="line">6</div><div class="line">7</div><div class="line">8</div><div class="line">9</div><div class="line">10</div><div class="line">11</div><div class="line">12</div><div class="line">13</div><div class="line">14</div><div class="line">15</div><div class="line">16</div><div class="line">17</div><div class="line">18</div><div class="line">19</div><div class="line">20</div><div class="line">21</div><div class="line">22</div><div class="line">23</div><div class="line">24</div><div class="line">25</div><div class="line">26</div><div class="line">27</div><div class="line">28</div><div class="line">29</div><div class="line">30</div><div class="line">31</div><div class="line">32</div><div class="line">33</div><div class="line">34</div><div class="line">35</div><div class="line">36</div><div class="line">37</div><div class="line">38</div><div class="line">39</div><div class="line">40</div><div class="line">41</div><div class="line">42</div><div class="line">43</div><div class="line">44</div><div class="line">45</div><div class="line">46</div><div class="line">47</div><div class="line">48</div><div class="line">49</div><div class="line">50</div><div class="line">51</div><div class="line">52</div><div class="line">53</div><div class="line">54</div><div class="line">55</div><div class="line">56</div><div class="line">57</div><div class="line">58</div><div class="line">59</div><div class="line">60</div><div class="line">61</div><div class="line">62</div><div class="line">63</div><div class="line">64</div><div class="line">65</div><div class="line">66</div><div class="line">67</div><div class="line">68</div><div class="line">69</div><div class="line">70</div><div class="line">71</div><div class="line">72</div><div class="line">73</div><div class="line">74</div></pre></td><td class="code"><pre><div class="line"><span class="tag">&lt;<span class="name">project</span> <span class="attr">xmlns</span>=<span class="string">"http://maven.apache.org/POM/4.0.0"</span> <span class="attr">xmlns:xsi</span>=<span class="string">"http://www.w3.org/2001/XMLSchema-instance"</span></span></div><div class="line"><span class="tag">  <span class="attr">xsi:schemaLocation</span>=<span class="string">"http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd"</span>&gt;</span></div><div class="line">  <span class="tag">&lt;<span class="name">modelVersion</span>&gt;</span>4.0.0<span class="tag">&lt;/<span class="name">modelVersion</span>&gt;</span></div><div class="line"></div><div class="line">  <span class="tag">&lt;<span class="name">groupId</span>&gt;</span>com.xiejm.bigdata<span class="tag">&lt;/<span class="name">groupId</span>&gt;</span></div><div class="line">  <span class="tag">&lt;<span class="name">artifactId</span>&gt;</span>hive-train<span class="tag">&lt;/<span class="name">artifactId</span>&gt;</span></div><div class="line">  <span class="tag">&lt;<span class="name">version</span>&gt;</span>1.0<span class="tag">&lt;/<span class="name">version</span>&gt;</span></div><div class="line">  <span class="tag">&lt;<span class="name">packaging</span>&gt;</span>jar<span class="tag">&lt;/<span class="name">packaging</span>&gt;</span></div><div class="line"></div><div class="line">  <span class="tag">&lt;<span class="name">name</span>&gt;</span>hive-train<span class="tag">&lt;/<span class="name">name</span>&gt;</span></div><div class="line">  <span class="tag">&lt;<span class="name">url</span>&gt;</span>http://maven.apache.org<span class="tag">&lt;/<span class="name">url</span>&gt;</span></div><div class="line"></div><div class="line">  <span class="tag">&lt;<span class="name">properties</span>&gt;</span></div><div class="line">    <span class="tag">&lt;<span class="name">project.build.sourceEncoding</span>&gt;</span>UTF-8<span class="tag">&lt;/<span class="name">project.build.sourceEncoding</span>&gt;</span></div><div class="line">    <span class="tag">&lt;<span class="name">hadoop.version</span>&gt;</span>2.6.0-cdh5.7.0<span class="tag">&lt;/<span class="name">hadoop.version</span>&gt;</span></div><div class="line">    <span class="tag">&lt;<span class="name">hive.version</span>&gt;</span>1.1.0-cdh5.7.0<span class="tag">&lt;/<span class="name">hive.version</span>&gt;</span></div><div class="line">  <span class="tag">&lt;/<span class="name">properties</span>&gt;</span></div><div class="line"></div><div class="line">  <span class="tag">&lt;<span class="name">repositories</span>&gt;</span></div><div class="line">    <span class="tag">&lt;<span class="name">repository</span>&gt;</span></div><div class="line">      <span class="tag">&lt;<span class="name">id</span>&gt;</span>cloudera<span class="tag">&lt;/<span class="name">id</span>&gt;</span></div><div class="line">      <span class="tag">&lt;<span class="name">url</span>&gt;</span> https://repository.cloudera.com/artifactory/cloudera-repos/<span class="tag">&lt;/<span class="name">url</span>&gt;</span></div><div class="line">    <span class="tag">&lt;/<span class="name">repository</span>&gt;</span></div><div class="line">  <span class="tag">&lt;/<span class="name">repositories</span>&gt;</span></div><div class="line"></div><div class="line">  <span class="tag">&lt;<span class="name">dependencies</span>&gt;</span></div><div class="line">    <span class="tag">&lt;<span class="name">dependency</span>&gt;</span></div><div class="line">      <span class="tag">&lt;<span class="name">groupId</span>&gt;</span>org.apache.hadoop<span class="tag">&lt;/<span class="name">groupId</span>&gt;</span></div><div class="line">      <span class="tag">&lt;<span class="name">artifactId</span>&gt;</span>hadoop-common<span class="tag">&lt;/<span class="name">artifactId</span>&gt;</span></div><div class="line">      <span class="tag">&lt;<span class="name">version</span>&gt;</span>$&#123;hadoop.version&#125;<span class="tag">&lt;/<span class="name">version</span>&gt;</span></div><div class="line">    <span class="tag">&lt;/<span class="name">dependency</span>&gt;</span></div><div class="line">    <span class="comment">&lt;!--cdh的hive依赖--&gt;</span></div><div class="line">    <span class="tag">&lt;<span class="name">dependency</span>&gt;</span></div><div class="line">      <span class="tag">&lt;<span class="name">groupId</span>&gt;</span>org.apache.hive<span class="tag">&lt;/<span class="name">groupId</span>&gt;</span></div><div class="line">      <span class="tag">&lt;<span class="name">artifactId</span>&gt;</span>hive-accumulo-handler<span class="tag">&lt;/<span class="name">artifactId</span>&gt;</span></div><div class="line">      <span class="tag">&lt;<span class="name">version</span>&gt;</span>$&#123;hive.version&#125;<span class="tag">&lt;/<span class="name">version</span>&gt;</span></div><div class="line">    <span class="tag">&lt;/<span class="name">dependency</span>&gt;</span></div><div class="line">    <span class="tag">&lt;<span class="name">dependency</span>&gt;</span></div><div class="line">      <span class="tag">&lt;<span class="name">groupId</span>&gt;</span>org.apache.hive<span class="tag">&lt;/<span class="name">groupId</span>&gt;</span></div><div class="line">      <span class="tag">&lt;<span class="name">artifactId</span>&gt;</span>hive-ant<span class="tag">&lt;/<span class="name">artifactId</span>&gt;</span></div><div class="line">      <span class="tag">&lt;<span class="name">version</span>&gt;</span>$&#123;hive.version&#125;<span class="tag">&lt;/<span class="name">version</span>&gt;</span></div><div class="line">    <span class="tag">&lt;/<span class="name">dependency</span>&gt;</span></div><div class="line">    <span class="tag">&lt;<span class="name">dependency</span>&gt;</span></div><div class="line">      <span class="tag">&lt;<span class="name">groupId</span>&gt;</span>org.apache.hive<span class="tag">&lt;/<span class="name">groupId</span>&gt;</span></div><div class="line">      <span class="tag">&lt;<span class="name">artifactId</span>&gt;</span>hive-beeline<span class="tag">&lt;/<span class="name">artifactId</span>&gt;</span></div><div class="line">      <span class="tag">&lt;<span class="name">version</span>&gt;</span>$&#123;hive.version&#125;<span class="tag">&lt;/<span class="name">version</span>&gt;</span></div><div class="line">    <span class="tag">&lt;/<span class="name">dependency</span>&gt;</span>    <span class="tag">&lt;<span class="name">dependency</span>&gt;</span></div><div class="line">    <span class="tag">&lt;<span class="name">groupId</span>&gt;</span>org.apache.hive<span class="tag">&lt;/<span class="name">groupId</span>&gt;</span></div><div class="line">    <span class="tag">&lt;<span class="name">artifactId</span>&gt;</span>hive-cli<span class="tag">&lt;/<span class="name">artifactId</span>&gt;</span></div><div class="line">    <span class="tag">&lt;<span class="name">version</span>&gt;</span>$&#123;hive.version&#125;<span class="tag">&lt;/<span class="name">version</span>&gt;</span></div><div class="line">  <span class="tag">&lt;/<span class="name">dependency</span>&gt;</span></div><div class="line">    <span class="tag">&lt;<span class="name">dependency</span>&gt;</span></div><div class="line">      <span class="tag">&lt;<span class="name">groupId</span>&gt;</span>org.apache.hive<span class="tag">&lt;/<span class="name">groupId</span>&gt;</span></div><div class="line">      <span class="tag">&lt;<span class="name">artifactId</span>&gt;</span>hive-common<span class="tag">&lt;/<span class="name">artifactId</span>&gt;</span></div><div class="line">      <span class="tag">&lt;<span class="name">version</span>&gt;</span>$&#123;hive.version&#125;<span class="tag">&lt;/<span class="name">version</span>&gt;</span></div><div class="line">    <span class="tag">&lt;/<span class="name">dependency</span>&gt;</span></div><div class="line">    <span class="tag">&lt;<span class="name">dependency</span>&gt;</span></div><div class="line">      <span class="tag">&lt;<span class="name">groupId</span>&gt;</span>org.apache.hive<span class="tag">&lt;/<span class="name">groupId</span>&gt;</span></div><div class="line">      <span class="tag">&lt;<span class="name">artifactId</span>&gt;</span>hive-contrib<span class="tag">&lt;/<span class="name">artifactId</span>&gt;</span></div><div class="line">      <span class="tag">&lt;<span class="name">version</span>&gt;</span>$&#123;hive.version&#125;<span class="tag">&lt;/<span class="name">version</span>&gt;</span></div><div class="line">    <span class="tag">&lt;/<span class="name">dependency</span>&gt;</span></div><div class="line">    <span class="tag">&lt;<span class="name">dependency</span>&gt;</span></div><div class="line">      <span class="tag">&lt;<span class="name">groupId</span>&gt;</span>org.apache.hive<span class="tag">&lt;/<span class="name">groupId</span>&gt;</span></div><div class="line">      <span class="tag">&lt;<span class="name">artifactId</span>&gt;</span>hive-exec<span class="tag">&lt;/<span class="name">artifactId</span>&gt;</span></div><div class="line">      <span class="tag">&lt;<span class="name">version</span>&gt;</span>$&#123;hive.version&#125;<span class="tag">&lt;/<span class="name">version</span>&gt;</span></div><div class="line">    <span class="tag">&lt;/<span class="name">dependency</span>&gt;</span></div><div class="line">    <span class="tag">&lt;<span class="name">dependency</span>&gt;</span></div><div class="line">      <span class="tag">&lt;<span class="name">groupId</span>&gt;</span>junit<span class="tag">&lt;/<span class="name">groupId</span>&gt;</span></div><div class="line">      <span class="tag">&lt;<span class="name">artifactId</span>&gt;</span>junit<span class="tag">&lt;/<span class="name">artifactId</span>&gt;</span></div><div class="line">      <span class="tag">&lt;<span class="name">version</span>&gt;</span>4.10<span class="tag">&lt;/<span class="name">version</span>&gt;</span></div><div class="line">      <span class="tag">&lt;<span class="name">scope</span>&gt;</span>test<span class="tag">&lt;/<span class="name">scope</span>&gt;</span></div><div class="line">    <span class="tag">&lt;/<span class="name">dependency</span>&gt;</span></div><div class="line">  <span class="tag">&lt;/<span class="name">dependencies</span>&gt;</span></div><div class="line"><span class="tag">&lt;/<span class="name">project</span>&gt;</span></div></pre></td></tr></table></figure>
<p>3.新建一个Java Class，命名为ToLower</p>
<figure class="highlight java"><table><tr><td class="gutter"><pre><div class="line">1</div><div class="line">2</div><div class="line">3</div><div class="line">4</div><div class="line">5</div><div class="line">6</div><div class="line">7</div><div class="line">8</div><div class="line">9</div><div class="line">10</div><div class="line">11</div><div class="line">12</div><div class="line">13</div><div class="line">14</div><div class="line">15</div><div class="line">16</div><div class="line">17</div><div class="line">18</div><div class="line">19</div><div class="line">20</div><div class="line">21</div><div class="line">22</div><div class="line">23</div><div class="line">24</div><div class="line">25</div><div class="line">26</div><div class="line">27</div></pre></td><td class="code"><pre><div class="line"><span class="keyword">package</span> com.xiejm.bigdata.hive;</div><div class="line"></div><div class="line"><span class="keyword">import</span> org.apache.hadoop.io.Text;</div><div class="line"><span class="keyword">import</span> org.apache.hadoop.hive.ql.exec.Description;</div><div class="line"><span class="keyword">import</span> org.apache.hadoop.hive.ql.exec.UDF;</div><div class="line"></div><div class="line"><span class="comment">/**</span></div><div class="line"><span class="comment"> * Created by jimmy on 2017/9/29.</span></div><div class="line"><span class="comment"> * 这个函数是将字符串全部转化为大写字母</span></div><div class="line"><span class="comment"> * add jar samplecode.jar;</span></div><div class="line"><span class="comment"> * create temporary function toupper as 'com.xiejm.bigdata.hive.ToUpper';</span></div><div class="line"><span class="comment"> */</span></div><div class="line"></div><div class="line"><span class="meta">@Description</span>(name = <span class="string">"ToUpper"</span>,</div><div class="line">        value = <span class="string">"_FUNC_(str) - Converts a string to uppercase"</span>,</div><div class="line">        extended = <span class="string">"Example:\n "</span></div><div class="line">                + <span class="string">"  &gt; SELECT _FUNC_(str) FROM src LIMIT 1;\n"</span>)</div><div class="line"></div><div class="line"><span class="keyword">public</span> <span class="class"><span class="keyword">class</span> <span class="title">ToUpper</span> <span class="keyword">extends</span> <span class="title">UDF</span> </span>&#123;</div><div class="line">    <span class="function"><span class="keyword">public</span> Text <span class="title">evaluate</span><span class="params">(Text input)</span> </span>&#123;</div><div class="line">        Text result = <span class="keyword">new</span> Text(<span class="string">""</span>);</div><div class="line">        <span class="keyword">if</span> (input != <span class="keyword">null</span>) &#123;</div><div class="line">            result.set(input.toString().toUpperCase());</div><div class="line">        &#125;</div><div class="line">        <span class="keyword">return</span> result;</div><div class="line">    &#125;</div><div class="line">&#125;</div></pre></td></tr></table></figure>
<p>4.在IDEA中编译该项目，将编译后的jar包上传到Linux中<br>5.hive中加载jar包</p>
<figure class="highlight bash"><table><tr><td class="gutter"><pre><div class="line">1</div><div class="line">2</div><div class="line">3</div><div class="line">4</div><div class="line">5</div><div class="line">6</div><div class="line">7</div><div class="line">8</div><div class="line">9</div><div class="line">10</div><div class="line">11</div><div class="line">12</div><div class="line">13</div><div class="line">14</div><div class="line">15</div><div class="line">16</div><div class="line">17</div><div class="line">18</div><div class="line">19</div><div class="line">20</div></pre></td><td class="code"><pre><div class="line"><span class="comment">## 添加UDF jar包</span></div><div class="line">hive (default)&gt; add jar /root/hive-train-1.0.jar</div><div class="line">Added [/root/hive-train-1.0.jar] to class path</div><div class="line">Added resources: [/root/hive-train-1.0.jar]</div><div class="line"></div><div class="line"><span class="comment">## 创建UDF函数</span></div><div class="line">hive (default)&gt; create temporary <span class="keyword">function</span> toupper as <span class="string">'com.xiejm.bigdata.hive.ToUpper'</span>;</div><div class="line">OK</div><div class="line">Time taken: 0.005 seconds</div><div class="line"></div><div class="line"><span class="comment">##查看UDF函数，显示函数注释</span></div><div class="line">hive (default)&gt; desc <span class="keyword">function</span> extended toupper;</div><div class="line">OK</div><div class="line">toupper(str) - Converts a string to uppercase</div><div class="line">Example:</div><div class="line">   &gt; SELECT toupper(str) FROM src LIMIT 1;</div><div class="line"></div><div class="line">Time taken: 0.007 seconds, Fetched: 4 row(s)</div><div class="line"></div><div class="line">hive (default)&gt; <span class="built_in">exit</span>;</div></pre></td></tr></table></figure>
<p>上面的方法介绍了如何添加临时的UDF函数，那么如果要永久加载这个函数，我们该怎么做呢？ </p>
<p>步骤和上面的方法是一样的，只需把CREATE语句中的TEMPORARY关键字删除即可。</p>
<p>当不需要这个函数了，可以把这个函数从Hive中删除,<br>使用下面的语法格式</p>
<figure class="highlight bash"><table><tr><td class="gutter"><pre><div class="line">1</div></pre></td><td class="code"><pre><div class="line">DROP [TEMPORARY] FUNCTION function_name</div></pre></td></tr></table></figure>
<h2 id="进阶：将自定义UDF编译到hive源码中"><a href="#进阶：将自定义UDF编译到hive源码中" class="headerlink" title="进阶：将自定义UDF编译到hive源码中"></a>进阶：将自定义UDF编译到hive源码中</h2><p>上面的方法每次使用都要add,create一下，还是很麻烦。那么问题来了：如何把UDF集成到Hive的源码中？</p>
<p><strong>步骤如下：</strong></p>
<p>1.下载hive1.1.0-cdh5.7.0的源码包<br>2.在Linux 上解压到/root目录<br>3.将编写好的 UDF 类文件ToUpper.java复制到Hive源码目录中,并修改</p>
<figure class="highlight bash"><table><tr><td class="gutter"><pre><div class="line">1</div><div class="line">2</div><div class="line">3</div><div class="line">4</div></pre></td><td class="code"><pre><div class="line">cp ToUpper.java /root/hive-1.1.0-cdh5.7.0/ql/src/java/org/apache/hadoop/hive/ql/udf</div><div class="line"></div><div class="line">vim /root/hive-1.1.0-cdh5.7.0/ql/src/java/org/apache/hadoop/hive/ql/udf/ToUpper.java</div><div class="line"><span class="comment">##将 package com.xiejm.bigdata.hello; 修改为 package org.apache.hadoop.hive.ql.udf;</span></div></pre></td></tr></table></figure>
<p>4.修改FunctionRegistry.java 文件</p>
<figure class="highlight bash"><table><tr><td class="gutter"><pre><div class="line">1</div></pre></td><td class="code"><pre><div class="line">vim /root/hive-1.1.0-cdh5.7.0/ql/src/java/org/apache/hadoop/hive/ql/<span class="built_in">exec</span>/FunctionRegistry.java</div></pre></td></tr></table></figure>
<p>添加import</p>
<figure class="highlight bash"><table><tr><td class="gutter"><pre><div class="line">1</div></pre></td><td class="code"><pre><div class="line">import org.apache.hadoop.hive.ql.udf.ToUpper;</div></pre></td></tr></table></figure>
<hr>
<figure class="highlight bash"><table><tr><td class="gutter"><pre><div class="line">1</div><div class="line">2</div></pre></td><td class="code"><pre><div class="line">// registry <span class="keyword">for</span> system <span class="built_in">functions</span></div><div class="line">  private static final Registry system = new Registry(<span class="literal">true</span>);</div></pre></td></tr></table></figure>
<p>找到上面的registry for system functions代码块中的static{}添加register</p>
<figure class="highlight bash"><table><tr><td class="gutter"><pre><div class="line">1</div></pre></td><td class="code"><pre><div class="line">system.registerUDF(<span class="string">"ToUpper"</span>, ToUpper.class, <span class="literal">false</span>);</div></pre></td></tr></table></figure>
<p>5.编译Hive</p>
<figure class="highlight bash"><table><tr><td class="gutter"><pre><div class="line">1</div><div class="line">2</div><div class="line">3</div></pre></td><td class="code"><pre><div class="line"><span class="built_in">cd</span> /root/hive-1.1.0-cdh5.7.0</div><div class="line"><span class="comment">## 编译时间比较长</span></div><div class="line">mvn clean package -DskipTests -Phadoop-2 -Pdist</div></pre></td></tr></table></figure>
<p>编译成功后会生成一个hive程序包和目录的地址</p>
<p>包地址：<code>/root/hive-1.1.0-cdh5.7.0/packaging/target/apache-hive-1.1.0-cdh5.7.0-bin.tar.gz</code></p>
<p>目录地址：<code>/root/hive-1.1.0-cdh5.7.0/packaging/target/apache-hive-1.1.0-cdh5.7.0-bin</code></p>
<p>6.使集成的UDF函数生效</p>
<p>方法一：<br>替换hive-exec-1.1.0-cdh5.7.0.jar包</p>
<figure class="highlight bash"><table><tr><td class="gutter"><pre><div class="line">1</div><div class="line">2</div><div class="line">3</div><div class="line">4</div></pre></td><td class="code"><pre><div class="line">mv <span class="variable">$HIVE_HOME</span>/lib/hive-exec-1.1.0-cdh5.7.0.jar <span class="variable">$HIVE_HOME</span>/lib/hive-exec-1.1.0-cdh5.7.0.jar.bak</div><div class="line"></div><div class="line"></div><div class="line">cp /root/hive-1.1.0-cdh5.7.0/packaging/target/apache-hive-1.1.0-cdh5.7.0-bin/apache-hive-1.1.0-cdh5.7.0-bin/lib/hive-exec-1.1.0-cdh5.7.0.jar <span class="variable">$HIVE_HOME</span>/lib/</div></pre></td></tr></table></figure>
<p>方法二：<br>复制编译生成的包重新部署。</p>
<p>7.测试UDF</p>
<p>测试先我们要结束hive的runjar进程，使用<code>jps</code>命令查看，如果有再用<code>kill -9</code> 结束进程</p>
<figure class="highlight bash"><table><tr><td class="gutter"><pre><div class="line">1</div><div class="line">2</div><div class="line">3</div><div class="line">4</div><div class="line">5</div><div class="line">6</div><div class="line">7</div><div class="line">8</div><div class="line">9</div><div class="line">10</div><div class="line">11</div><div class="line">12</div></pre></td><td class="code"><pre><div class="line"><span class="comment">## 显示自定义UDF函数toupper信息</span></div><div class="line">hive (default)&gt; desc <span class="keyword">function</span> toupper;</div><div class="line">OK</div><div class="line">toupper(str) - Converts a string to uppercase</div><div class="line">Time taken: 0.014 seconds, Fetched: 1 row(s)</div><div class="line"></div><div class="line"><span class="comment">## 调用toupper函数</span></div><div class="line">hive (default)&gt; select toupper(<span class="string">'jimmy'</span>) from emp <span class="built_in">limit</span> 2;</div><div class="line">OK</div><div class="line">JIMMY</div><div class="line">JIMMY</div><div class="line">Time taken: 0.087 seconds, Fetched: 2 row(s)</div></pre></td></tr></table></figure>

        </div>

        <blockquote class="post-copyright">
    <div class="content">
        
<span class="post-time">
    最后更新时间：<time datetime="2017-10-08T06:39:23.586Z" itemprop="dateUpdated">2017-10-08 14:39:23</time>
</span><br>


        
        原始链接：<a href="/Hive/Hive--UDF.html" target="_blank" rel="external">http://xiejm.com/Hive/Hive--UDF.html</a>
        
    </div>
    <footer>
        <a href="http://xiejm.com">
            <img src="/img/avatar.jpg" alt="XieJM">
            XieJM
        </a>
    </footer>
</blockquote>

        


        <div class="post-footer">
            
	<ul class="article-tag-list"><li class="article-tag-list-item"><a class="article-tag-list-link" href="/tags/Hive/">Hive</a></li></ul>


            
<div class="page-share-wrap">
    

<div class="page-share" id="pageShare">
    <ul class="reset share-icons">
      <li>
        <a class="weibo share-sns" target="_blank" href="http://service.weibo.com/share/share.php?url=http://xiejm.com/Hive/Hive--UDF.html&title=《Hive--UDF开发》 — XieJM's Blog&pic=http://xiejm.com/img/avatar.jpg" data-title="微博">
          <i class="icon icon-weibo"></i>
        </a>
      </li>
      <li>
        <a class="weixin share-sns wxFab" href="javascript:;" data-title="微信">
          <i class="icon icon-weixin"></i>
        </a>
      </li>
      <li>
        <a class="qq share-sns" target="_blank" href="http://connect.qq.com/widget/shareqq/index.html?url=http://xiejm.com/Hive/Hive--UDF.html&title=《Hive--UDF开发》 — XieJM's Blog&source=Linux | Hadoop | CDH | Hive | Hbase | Spark" data-title=" QQ">
          <i class="icon icon-qq"></i>
        </a>
      </li>
      <li>
        <a class="facebook share-sns" target="_blank" href="https://www.facebook.com/sharer/sharer.php?u=http://xiejm.com/Hive/Hive--UDF.html" data-title=" Facebook">
          <i class="icon icon-facebook"></i>
        </a>
      </li>
      <li>
        <a class="twitter share-sns" target="_blank" href="https://twitter.com/intent/tweet?text=《Hive--UDF开发》 — XieJM's Blog&url=http://xiejm.com/Hive/Hive--UDF.html&via=http://xiejm.com" data-title=" Twitter">
          <i class="icon icon-twitter"></i>
        </a>
      </li>
      <li>
        <a class="google share-sns" target="_blank" href="https://plus.google.com/share?url=http://xiejm.com/Hive/Hive--UDF.html" data-title=" Google+">
          <i class="icon icon-google-plus"></i>
        </a>
      </li>
    </ul>
 </div>



    <a href="javascript:;" id="shareFab" class="page-share-fab waves-effect waves-circle">
        <i class="icon icon-share-alt icon-lg"></i>
    </a>
</div>



        </div>
    </div>

    
<nav class="post-nav flex-row flex-justify-between">
  
    <div class="waves-block waves-effect prev">
      <a href="/Hive/Hive--Overview.html" id="post-prev" class="post-nav-link">
        <div class="tips"><i class="icon icon-angle-left icon-lg icon-pr"></i> Prev</div>
        <h4 class="title">Hive--概述</h4>
      </a>
    </div>
  

  
    <div class="waves-block waves-effect next">
      <a href="/Hive/Hive--install.html" id="post-next" class="post-nav-link">
        <div class="tips">Next <i class="icon icon-angle-right icon-lg icon-pl"></i></div>
        <h4 class="title">Hive--安装</h4>
      </a>
    </div>
  
</nav>



    


<section class="comments" id="comments">
    <div id="disqus_thread"></div>
    <script>
    var disqus_shortname = 'true';
    lazyScripts.push('//' + disqus_shortname + '.disqus.com/embed.js')
    </script>
    <noscript>Please enable JavaScript to view the <a href="https://disqus.com/?ref_noscript">comments powered by Disqus.</a></noscript>
</section>













</article>



</div>

    </main>
    <div class="mask" id="mask"></div>
<a href="javascript:;" id="gotop" class="waves-effect waves-circle waves-light"><span class="icon icon-lg icon-chevron-up"></span></a>



<div class="global-share" id="globalShare">
    <ul class="reset share-icons">
      <li>
        <a class="weibo share-sns" target="_blank" href="http://service.weibo.com/share/share.php?url=http://xiejm.com/Hive/Hive--UDF.html&title=《Hive--UDF开发》 — XieJM's Blog&pic=http://xiejm.com/img/avatar.jpg" data-title="微博">
          <i class="icon icon-weibo"></i>
        </a>
      </li>
      <li>
        <a class="weixin share-sns wxFab" href="javascript:;" data-title="微信">
          <i class="icon icon-weixin"></i>
        </a>
      </li>
      <li>
        <a class="qq share-sns" target="_blank" href="http://connect.qq.com/widget/shareqq/index.html?url=http://xiejm.com/Hive/Hive--UDF.html&title=《Hive--UDF开发》 — XieJM's Blog&source=Linux | Hadoop | CDH | Hive | Hbase | Spark" data-title=" QQ">
          <i class="icon icon-qq"></i>
        </a>
      </li>
      <li>
        <a class="facebook share-sns" target="_blank" href="https://www.facebook.com/sharer/sharer.php?u=http://xiejm.com/Hive/Hive--UDF.html" data-title=" Facebook">
          <i class="icon icon-facebook"></i>
        </a>
      </li>
      <li>
        <a class="twitter share-sns" target="_blank" href="https://twitter.com/intent/tweet?text=《Hive--UDF开发》 — XieJM's Blog&url=http://xiejm.com/Hive/Hive--UDF.html&via=http://xiejm.com" data-title=" Twitter">
          <i class="icon icon-twitter"></i>
        </a>
      </li>
      <li>
        <a class="google share-sns" target="_blank" href="https://plus.google.com/share?url=http://xiejm.com/Hive/Hive--UDF.html" data-title=" Google+">
          <i class="icon icon-google-plus"></i>
        </a>
      </li>
    </ul>
 </div>


<div class="page-modal wx-share" id="wxShare">
    <a class="close" href="javascript:;"><i class="icon icon-close"></i></a>
    <p>扫一扫，分享到微信</p>
    <img src="" alt="微信分享二维码">
</div>




    <script src="//cdn.bootcss.com/node-waves/0.7.4/waves.min.js"></script>
<script>
var BLOG = { ROOT: '/', SHARE: true, REWARD: false };


</script>

<script src="/js/main.min.js?v=1.6.13"></script>


<div class="search-panel" id="search-panel">
    <ul class="search-result" id="search-result"></ul>
</div>
<template id="search-tpl">
<li class="item">
    <a href="{path}" class="waves-block waves-effect">
        <div class="title ellipsis" title="{title}">{title}</div>
        <div class="flex-row flex-middle">
            <div class="tags ellipsis">
                {tags}
            </div>
            <time class="flex-col time">{date}</time>
        </div>
    </a>
</li>
</template>

<script src="/js/search.min.js?v=1.6.13" async></script>






<script async src="//dn-lbstatics.qbox.me/busuanzi/2.3/busuanzi.pure.mini.js"></script>



<script>
(function() {
    var OriginTitile = document.title, titleTime;
    document.addEventListener('visibilitychange', function() {
        if (document.hidden) {
            document.title = 'XieJM's Blog';
            clearTimeout(titleTime);
        } else {
            document.title = 'XieJM's Blog';
            titleTime = setTimeout(function() {
                document.title = OriginTitile;
            },2000);
        }
    });
})();
</script>



</body>
</html>
